Computer vision - Lab 6

Agenda

Image segmentation based on thresholding, cluster analysis, edge detection and region growing.

Helpers

Images

Visualization

Image segmentation

Segmentation through thresholding

The simplest way to segment an image is pixel intensity thresholding: every intensity above a chosen threshold is replaced with one constant value, and every other intensity with another.
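The operation described above can be sketched in a few lines of NumPy. The function name `binary_threshold`, its parameters and the tiny sample image are assumptions made for illustration only.

```python
import numpy as np

def binary_threshold(image, threshold, low_value=0, high_value=255):
    """Replace intensities above `threshold` with `high_value`, all others with `low_value`."""
    return np.where(image > threshold, high_value, low_value).astype(np.uint8)

# Hypothetical 2x3 grayscale image, used only for illustration.
image = np.array([[10, 120, 200],
                  [130, 90, 255]], dtype=np.uint8)
segmented = binary_threshold(image, threshold=127)
```

Here `segmented` contains only the two constant values, so it can be read directly as a two-class segmentation mask.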

There is also segmentation with multiple thresholds, which was presented in the first class.
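As a reminder of the multi-threshold variant, here is a minimal sketch that assigns each pixel the index of the intensity band it falls into. The helper `multi_threshold` and the threshold values are hypothetical; `np.digitize` does the actual work.

```python
import numpy as np

def multi_threshold(image, thresholds):
    """Label each pixel with the index of the intensity band it falls into."""
    # np.digitize returns 0 below the first threshold, 1 for the next band, and so on.
    return np.digitize(image, bins=sorted(thresholds))

# Hypothetical 2x3 grayscale image and two hypothetical thresholds.
image = np.array([[10, 120, 200],
                  [130, 90, 255]], dtype=np.uint8)
labels = multi_threshold(image, thresholds=[100, 180])
```

With two thresholds the image is split into three classes (labels 0, 1 and 2).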

The OpenCV library includes ready-made implementations of several simple image thresholding approaches. To threshold an image in OpenCV, call the threshold() function, which takes the image, the threshold value, the maximum value and the thresholding method to use.

The available thresholding methods include THRESH_BINARY, THRESH_BINARY_INV, THRESH_TRUNC, THRESH_TOZERO and THRESH_TOZERO_INV.
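To make the differences between the basic OpenCV modes concrete, the sketch below emulates them in NumPy. It is not the OpenCV implementation: `threshold_sketch` is a hypothetical helper, and the mode is passed as a string named after the corresponding OpenCV flag instead of the real `cv2` constant.

```python
import numpy as np

MODES = ["THRESH_BINARY", "THRESH_BINARY_INV", "THRESH_TRUNC",
         "THRESH_TOZERO", "THRESH_TOZERO_INV"]

def threshold_sketch(image, thresh, maxval, method):
    """NumPy emulation of the five basic thresholding modes (illustrative only)."""
    above = image > thresh
    if method == "THRESH_BINARY":        # maxval above the threshold, 0 otherwise
        out = np.where(above, maxval, 0)
    elif method == "THRESH_BINARY_INV":  # 0 above the threshold, maxval otherwise
        out = np.where(above, 0, maxval)
    elif method == "THRESH_TRUNC":       # clip values above the threshold to it
        out = np.where(above, thresh, image)
    elif method == "THRESH_TOZERO":      # keep values above the threshold, zero the rest
        out = np.where(above, image, 0)
    elif method == "THRESH_TOZERO_INV":  # zero values above the threshold, keep the rest
        out = np.where(above, 0, image)
    else:
        raise ValueError(f"unknown method: {method}")
    return out.astype(image.dtype)

# Hypothetical row of intensities around the threshold 127.
row = np.array([50, 127, 128, 220], dtype=np.uint8)
results = {mode: threshold_sketch(row, 127, 255, mode).tolist() for mode in MODES}
```

Note that a pixel exactly equal to the threshold (127 here) is treated as "not above" in every mode.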

Sudoku example

Otsu's method

Otsu's method is an algorithm that automatically selects the threshold value so that the two resulting (binarized) classes have the lowest possible intra-class pixel intensity variance (which is equivalent to maximizing the inter-class variance).

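The function minimized by Otsu's method is: $$\sigma^2_w(t) = Q_1\sigma^2_1 + Q_2\sigma^2_2$$ where $Q_1$ and $Q_2$ are the probabilities of the two classes produced by the threshold $t$, and $\sigma^2_1$ and $\sigma^2_2$ are the intensity variances within those classes.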

The first step is to calculate the probability of each pixel intensity (from 0 to 255). To do so, it is enough to compute the histogram and normalize it. Additionally, the cumulative distribution function is calculated, because it is needed for the conditional means and variances.
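A minimal sketch of this first step, assuming an 8-bit grayscale image stored as a NumPy array. The function name `intensity_distribution` and the sample image are hypothetical.

```python
import numpy as np

def intensity_distribution(image):
    """Return the probability of each intensity (0..255) and the cumulative distribution."""
    hist = np.bincount(image.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()   # normalized histogram: P(x)
    cdf = np.cumsum(prob)      # cdf[t] = P(intensity <= t)
    return prob, cdf

# Hypothetical 2x3 grayscale image.
image = np.array([[0, 0, 255],
                  [128, 128, 128]], dtype=np.uint8)
prob, cdf = intensity_distribution(image)
```

For a threshold `t`, `cdf[t]` is the probability of the class of pixels with intensity at most `t`, and `1 - cdf[t]` is the probability of the other class.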

A single iteration of Otsu's algorithm (for one fixed threshold) consists in splitting the probability distribution into the distributions of the pixels of the two classes. To determine the mean of a class, it is necessary to use the formula for the conditional expected value (because we average the pixels under the condition that a given class occurs, or in other words: the "expected value for a given class").

$$ E[C_1 | t] = \sum_{x=0}^{t} \frac{xP(x)}{Q_1}$$

Then, to calculate the (conditional) variances, we use the previously calculated conditional expected values: $$\sigma^2_1 = \sum_{x=0}^{t} \frac{(x - E[C_1|t])^2P(x)}{Q_1}$$

In its most basic version, Otsu's algorithm iterates over all possible divisions (thresholds from 1 to 254) and selects the one for which the objective function presented above returns the smallest value.
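The exhaustive search can be sketched as follows. This is an illustrative NumPy version, not the OpenCV implementation; it assumes an 8-bit grayscale image, puts intensities up to and including the threshold into the first class, and the name `otsu_threshold` is hypothetical.

```python
import numpy as np

def otsu_threshold(image):
    """Return the threshold t (1..254) minimizing Q1*var1 + Q2*var2, or None if no split exists."""
    prob = np.bincount(image.ravel(), minlength=256) / image.size
    levels = np.arange(256)
    best_t, best_score = None, np.inf
    for t in range(1, 255):
        p1, p2 = prob[:t + 1], prob[t + 1:]      # class 1: x <= t, class 2: x > t
        q1, q2 = p1.sum(), p2.sum()              # class probabilities Q1, Q2
        if q1 == 0 or q2 == 0:
            continue                             # one class is empty, skip this split
        x1, x2 = levels[:t + 1], levels[t + 1:]
        mean1 = (x1 * p1).sum() / q1             # conditional expected values
        mean2 = (x2 * p2).sum() / q2
        var1 = ((x1 - mean1) ** 2 * p1).sum() / q1   # conditional variances
        var2 = ((x2 - mean2) ** 2 * p2).sum() / q2
        score = q1 * var1 + q2 * var2            # within-class variance
        if score < best_score:
            best_t, best_score = t, score
    return best_t

# Hypothetical image with a dark group (40..60) and a bright group (190..210).
image = np.array([[40, 50, 60],
                  [190, 200, 210]], dtype=np.uint8)
t = otsu_threshold(image)
```

For this sample any threshold between the two groups gives the same objective value, so the returned `t` lies somewhere in the gap between 60 and 189.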

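To use Otsu's method in OpenCV, add the THRESH_OTSU flag to the thresholding method passed to threshold(). The threshold value given in the call is then ignored, and the value selected by the algorithm is returned together with the thresholded image.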

Adaptive methods

Among the thresholding methods, there are also adaptive methods. These are thresholding methods that adjust the threshold value depending on the image content.

Adaptive thresholding methods often work very well when the input image is divided into smaller areas and the threshold value is adjusted separately for each area. The motivation behind such a mechanism is the fact that in real images the lighting (as well as focus, balance, etc.) is uneven.
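The idea of per-area thresholds can be sketched with a simple tile-based scheme: each tile is thresholded by its own mean intensity. This is only an illustration; OpenCV's `adaptiveThreshold()` instead uses a sliding neighborhood around every pixel. The names `blockwise_threshold`, `block_size` and `offset` are assumptions.

```python
import numpy as np

def blockwise_threshold(image, block_size, offset=0):
    """Threshold each block_size x block_size tile by its own mean intensity minus `offset`."""
    out = np.zeros_like(image, dtype=np.uint8)
    height, width = image.shape
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            tile = image[top:top + block_size, left:left + block_size]
            local_threshold = tile.mean() - offset
            out[top:top + block_size, left:left + block_size] = np.where(
                tile > local_threshold, 255, 0)
    return out

# Hypothetical unevenly lit image: the left half is dark, the right half is bright.
image = np.array([[10, 30, 200, 240],
                  [30, 10, 240, 200]], dtype=np.uint8)
local = blockwise_threshold(image, block_size=2)
global_result = np.where(image > 127, 255, 0).astype(np.uint8)
```

The global threshold only separates the dark half from the bright half, while the tile-based version recovers the pattern inside each half.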

Segmentation of multi-channel images

Segmentation of multi-channel images (e.g. RGB) by simple thresholding methods becomes problematic due to the need to define thresholds in N-dimensional space. Therefore, cluster analysis methods are more often used to segment multi-channel images instead of simple thresholding.

Cluster analysis is about finding clusters of pixels in a certain space (even directly in the intensity space!) and treating each cluster as a separate pixel class.

For the image below, let's perform a simple pixel intensity analysis.

The image, represented as a list of pixels (BGR), is displayed in 3D space, where the coordinates of each pixel are its channel intensities. Additionally, each point is colored with the color of the pixel it represents.
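Preparing the data for such a plot can be sketched as below. The helper `image_to_points` and the 2x2 sample image are hypothetical; the plotting call itself is left out.

```python
import numpy as np

def image_to_points(image_bgr):
    """Flatten an H x W x 3 BGR image into a pixel list and matching RGB colors in [0, 1]."""
    points = image_bgr.reshape(-1, 3).astype(np.float64)   # one row per pixel: (B, G, R)
    colors_rgb = points[:, ::-1] / 255.0                   # reversed channels, scaled for plotting
    return points, colors_rgb

# Hypothetical 2x2 BGR image: blue, green, red and a dark pixel.
image_bgr = np.array([[[255, 0, 0], [0, 255, 0]],
                      [[0, 0, 255], [10, 20, 30]]], dtype=np.uint8)
points, colors_rgb = image_to_points(image_bgr)
```

The three columns of `points` can then serve as the x, y and z coordinates of a 3D scatter plot, with `colors_rgb` as the point colors.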

From the analysis of the above visualization, a few conclusions can be drawn:

One method of splitting a space into clusters is the Gaussian mixture model, which approximates the distribution of the data with N Gaussian distributions. It is a method with trainable parameters, so a sample of data is needed to which the mathematical model can be fitted.

A ready-made implementation of the algorithm is available in the scikit-learn library (sklearn.mixture.GaussianMixture).

We will use the list of pixels (BGR) as the data to which we fit the model, and then assign each pixel the index of the Gaussian component to which it belongs with the highest probability.
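To show what fitting and assignment involve, here is a rough NumPy sketch of expectation-maximization for a Gaussian mixture. It is a simplified stand-in, not the scikit-learn implementation: covariances are restricted to diagonal, the number of iterations is fixed, and the initial means are picked from the darkest to the brightest pixel (all of these are assumptions, as is the name `fit_gmm`).

```python
import numpy as np

def fit_gmm(points, n_components, n_iter=50):
    """Fit a diagonal-covariance Gaussian mixture with EM; return hard labels and means."""
    n = points.shape[0]
    # Assumed initialization: evenly spaced pixels after sorting by brightness.
    order = np.argsort(points.sum(axis=1))
    means = points[order[np.linspace(0, n - 1, n_components).astype(int)]].copy()
    variances = np.tile(points.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of every component for every pixel (computed in log space).
        diff = points[:, None, :] - means[None, :, :]
        log_prob = -0.5 * (diff ** 2 / variances + np.log(2 * np.pi * variances)).sum(axis=2)
        log_prob += np.log(weights)
        log_prob -= log_prob.max(axis=1, keepdims=True)
        resp = np.exp(log_prob)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances from the responsibilities.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ points) / nk[:, None]
        diff = points[:, None, :] - means[None, :, :]
        variances = (resp[:, :, None] * diff ** 2).sum(axis=0) / nk[:, None] + 1e-6
    return resp.argmax(axis=1), means

# Hypothetical pixel list (BGR): four dark pixels followed by four bright ones.
points = np.array([[10, 12, 8], [12, 9, 11], [8, 11, 10], [11, 10, 12],
                   [200, 198, 202], [199, 201, 200], [202, 200, 199], [201, 202, 198]],
                  dtype=np.float64)
labels, means = fit_gmm(points, n_components=2)
```

Each entry of `labels` is the index of the component with the highest responsibility for that pixel, which is the hard assignment described above.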

The next step will be to calculate the average color for each class (segment) and display the pixels again with the colors representing the segmentation.
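A minimal sketch of this step, assuming the pixel list and the per-pixel class labels are already available as NumPy arrays. The function `recolor_by_class` and the sample data are hypothetical.

```python
import numpy as np

def recolor_by_class(points, labels):
    """Replace every pixel with the mean color of the class it was assigned to."""
    recolored = np.empty_like(points, dtype=np.float64)
    for class_id in np.unique(labels):
        mask = labels == class_id
        recolored[mask] = points[mask].mean(axis=0)
    return recolored

# Hypothetical pixel list (BGR) with two classes.
points = np.array([[0, 0, 0], [10, 20, 30],
                   [200, 200, 200], [220, 240, 200]], dtype=np.float64)
labels = np.array([0, 0, 1, 1])
segmented = recolor_by_class(points, labels)
```

Reshaping `segmented` back to the original height, width and three channels (and casting to `np.uint8`) gives the segmented image.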

Pixels with mapped classes form the final image segmentation based on cluster analysis. The input image and the segmentation result are shown below.

Segmentation by edge detection

Segmentation through edge detection builds on the material from the previous class on detecting key points, corners and edges.

The idea is to divide the image on the basis of edges, and then fill the closed areas, assigning consecutive identifiers to consecutive separate areas.
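The fill-and-number idea can be sketched with a breadth-first flood fill over a binary edge map. The sketch assumes the edge map is already computed, uses 4-connectivity, and the name `label_regions` is hypothetical.

```python
from collections import deque
import numpy as np

def label_regions(edges):
    """Give consecutive identifiers (1, 2, ...) to 4-connected areas of non-edge pixels."""
    height, width = edges.shape
    labels = np.zeros((height, width), dtype=np.int32)   # 0 is reserved for edge pixels
    next_id = 0
    for row in range(height):
        for col in range(width):
            if edges[row, col] or labels[row, col]:
                continue                                  # edge pixel or already labeled
            next_id += 1
            labels[row, col] = next_id
            queue = deque([(row, col)])
            while queue:                                  # flood fill of one closed area
                r, c = queue.popleft()
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and not edges[nr, nc] and labels[nr, nc] == 0):
                        labels[nr, nc] = next_id
                        queue.append((nr, nc))
    return labels, next_id

# Hypothetical edge map: the edge pixels (1) separate a 2x2 block from the right column.
edges = np.array([[0, 0, 1, 0],
                  [0, 0, 1, 0],
                  [1, 1, 1, 0]], dtype=bool)
labels, count = label_regions(edges)
```

Every area enclosed by edges receives its own identifier, and `count` is the number of areas found.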

The assignment of identifiers to areas is presented in the next section.

Segmentation by region growing

Segmentation by region growing consists in iteratively joining adjacent areas. Two areas are joined when they pass the uniformity test, and the algorithm runs until the stop condition is met.

Uniformity test - tentatively combining two areas and checking a certain condition. An example condition is the difference between the mean pixel intensities of the two areas: if it is greater than a certain threshold, the areas are not uniform and are not merged.

Stop condition - either no further merges of areas are possible, or an early-stopping condition is met (e.g. when we do not want the areas to grow beyond a set limit).
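Both conditions can be illustrated with seeded region growing, a simpler variant in which a single region grows pixel by pixel from a starting point instead of merging whole areas. The sketch assumes a grayscale image; the names `grow_region`, `max_diff` and `max_size` are hypothetical.

```python
from collections import deque
import numpy as np

def grow_region(image, seed, max_diff, max_size=None):
    """Grow one region from `seed` by adding 4-connected neighbors that pass the uniformity test."""
    height, width = image.shape
    in_region = np.zeros((height, width), dtype=bool)
    in_region[seed] = True
    total, size = float(image[seed]), 1          # running sum and pixel count of the region
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if max_size is not None and size >= max_size:
                return in_region                 # early stop: the region reached its size limit
            if not (0 <= nr < height and 0 <= nc < width) or in_region[nr, nc]:
                continue
            # Uniformity test: the candidate must be close to the current region mean.
            if abs(float(image[nr, nc]) - total / size) <= max_diff:
                in_region[nr, nc] = True
                total += float(image[nr, nc])
                size += 1
                queue.append((nr, nc))
    return in_region                             # stop: no neighbor passes the test any more

# Hypothetical image: a dark area in the top-left corner surrounded by bright pixels.
image = np.array([[10, 12, 200],
                  [11, 13, 210],
                  [12, 205, 220]], dtype=np.uint8)
region = grow_region(image, seed=(0, 0), max_diff=10)
limited = grow_region(image, seed=(0, 0), max_diff=10, max_size=3)
```

Without a size limit the region covers the whole dark area; with `max_size=3` growth stops as soon as three pixels have been collected.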

Tasks

Task 1

As in the section on multi-channel image segmentation, perform the same pixel intensity cluster analysis for the './skittles.jpg' image and then segment the image using the K-Means algorithm (available, among others, in the scikit-learn library: sklearn.cluster.KMeans).

Present the intermediate results:

Task 2

Using the methods you learned in the previous class, find the number of Skittles in the image './skittles.jpg' (it is not necessary to use the solution from task 1). Show intermediate results and describe the processing steps in comments.

Show the original image with the individual Skittles that were found marked on it.

Task 3

  1. Test the solution from task 2 on the remaining Skittles images.
  2. Improve the solution so that it works properly for these images.